1. Introduction

As we enter the era of the 13 TeV LHC, it is increasingly important that parton distribution
functions (PDFs) and their uncertainties are well understood. Already for a number of key LHC
measurements PDF uncertainties are of equivalent size to the experimental uncertainties [1], and
this situation will get worse as more data is collected, and as the size of other theory
uncertainties (e.g. from the scale) is reduced. To face this challenge, new global determinations
of PDFs have recently been performed: NNPDF3.0 [2], MMHT2014 [3], and the upcoming CT14. Due to
improvements in the methodology and theory used by the various fitting groups, discrepancies
between their results highlighted in previous benchmarking exercises [4] have reduced, resulting
in many cases in improved agreement between the new sets. This better agreement has also led to
the development of combined PDF sets [5]. However, it is still important to test statistically
that the methodologies used to perform the PDF fits are valid, and produce a result which is
unbiased.

As part of the NNPDF3.0 analysis, we performed a large number of closure tests on our fitting 
methodology. As described later in this contribution, these tests involved fitting a set of pseudo-data 
generated from a chosen PDF set, with the aim of assessing whether the fit successfully reproduced
the supplied underlying law, up to statistical fluctuations. Results of many of these closure tests
were presented in the main NNPDF3.0 paper [2]; in these proceedings I will present a number of
additional results, specifically of closure tests using reduced datasets, and for LHC observables.

2. NNPDF3.0 

NNPDF3.0, our latest PDF determination, was released in October 2014 [2] and features new
data from the LHC and HERA, and a completely updated analysis code and methodology. Amongst
the new data included were HERA-II structure function data from both H1 and ZEUS, and HERA
combined charm production cross-section data. Adding to the LHC data already included in the
NNPDF2.3 analysis [6], NNPDF3.0 introduced a large amount of the released LHC run-I data,
including many new processes important to PDFs. We added the ATLAS 2.76 TeV inclusive jet
data with systematics fully correlated to the previously included 7 TeV ATLAS set, which increases
the impact of both sets in the fit. Also new were the W + c data from CMS, which provide important
information on the strange PDFs, and data on the total top pair production cross-section from both
ATLAS and CMS, making use of the recent result for the full NNLO calculation [7].

For this new determination we also renovated our fitting methodology. The NNPDF3.0 fits
were performed using a completely new code, written in C++ and optimised for the computationally
intensive hadronic calculations required for our fits. The genetic algorithm we use to perform the
minimisation was updated, with a new mutation approach which makes use of the structure of the
neural networks to improve both fitting speed and results. We extended the number and kinematic
range of positivity observables we used in the fits, in order to better constrain unphysical negativity
in the PDFs. As I will discuss further in Section 4, NNPDF3.0 also features an improved form of
cross-validation, which prevents over-learning in the neural networks but with a reduced chance of
under-learning due to premature stopping of the fit.

The NNPDF3.0 sets are available on LHAPDF [8], with determinations at LO, NLO and
NNLO for multiple values of α_s, and also for a number of reduced datasets. More details about the
new data included and the new methodological features are available in the paper [2].

3. Closure testing 

One of the central features of NNPDF3.0, which was also central to the development of
the new methodology, is the use of closure tests. The basic idea of these tests is to perform a
fit where we know the underlying ‘correct’ answer, allowing us to directly evaluate how
accurate our fits are. This technique was used in NNPDF3.0 both to validate our results and to test
improvements to the methodology.

In order to perform a closure test, first a set of pseudo-data is generated using a chosen input 
PDF set. The pseudo-data we used was based on the NNPDF3.0 dataset, using the same covariance 
matrix but with central values based on the theory value from the PDF set, fluctuated according to 
the experimental uncertainties. Different central values can be obtained by using different random 
seeds in this process, so the closure test is repeatable. There are also a number of different options 
at this stage. For instance, a variation on the standard closure tests can be done where the
pseudo-data central values are generated without statistical fluctuations, i.e. are set as the pure
theory value from the input PDFs. This provides an environment to test the neural network minimisation
where over-learning is impossible, and features can be evaluated purely on the goodness-of-fit they obtain.
Once the pseudo-data is generated with the chosen settings, a PDF fit can be performed with it in 
exactly the same way as with the real experimental data. 
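
The generation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NNPDF code: the function names `generate_pseudodata` and `chi2` are hypothetical, the fluctuations are assumed to be Gaussian and drawn from the full experimental covariance matrix, and the toy numbers stand in for real theory predictions.

```python
import numpy as np

def generate_pseudodata(theory, covariance, seed=None, fluctuate=True):
    """Return closure-test central values for one pseudo-dataset.

    theory     : predictions from the input PDF set, shape (n_data,)
    covariance : experimental covariance matrix, shape (n_data, n_data)
    seed       : random seed; a fixed seed makes the test repeatable
    fluctuate  : if False, return the pure theory values (no noise)
    """
    theory = np.asarray(theory, dtype=float)
    if not fluctuate:
        return theory.copy()
    rng = np.random.default_rng(seed)
    # Assumption: Gaussian shifts correlated according to the covariance
    return rng.multivariate_normal(theory, covariance)

def chi2(data, prediction, covariance):
    """Chi-squared between data and a prediction using the full covariance."""
    diff = np.asarray(data, dtype=float) - np.asarray(prediction, dtype=float)
    return float(diff @ np.linalg.solve(covariance, diff))

# Toy numbers (hypothetical) standing in for real theory predictions
theory = np.array([1.0, 2.0, 3.0])
covariance = np.array([[0.04, 0.01, 0.00],
                       [0.01, 0.09, 0.02],
                       [0.00, 0.02, 0.16]])

noiseless = generate_pseudodata(theory, covariance, fluctuate=False)
noisy_a = generate_pseudodata(theory, covariance, seed=1)
noisy_b = generate_pseudodata(theory, covariance, seed=1)  # same seed as noisy_a
noisy_c = generate_pseudodata(theory, covariance, seed=2)  # different seed
```

With fluctuations switched off, the input PDFs reproduce the pseudo-data exactly, so a perfect fit would reach a vanishing χ²; with fluctuations on, changing the seed gives a different but equally valid pseudo-dataset.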

Given how the pseudo-data is generated, it is automatically perfectly consistent with the theory 
used to generate it, and so the precise theoretical choices (perturbative order, quark mass thresholds, 
value of α_s) do not affect the results as long as the same settings are used in both cases. For the
closure tests presented here and in [2], we use the default NNPDF3.0 NLO settings. One related
issue is the use of positivity constraints in the closure test fits. If these are not satisfied by the input 
PDFs, including them in the fit could introduce some tension between them and the pseudo-data. 
On this basis we do not include the positivity constraints where this is the case. 

4. Cross validation 

One key improvement in the NNPDF methodology introduced for NNPDF3.0 is improved
cross-validation. The idea of cross-validation is simple: in order to prevent the neural networks
from over-learning, split the dataset into two halves, training the networks with one while
monitoring the quality of the fit to the other, validation, set. While the χ² is improving for both the
training and validation sets, the neural network is correctly learning the underlying
law; when it improves for the training set but deteriorates for the validation set, the neural
networks are overfitting. In previous analyses, we monitored the validation χ² during the
minimisation and stopped the fit when it started to increase. However, the validation χ² is subject
to a substantial amount of noise, and this approach could lead to the fit being stopped too early,
resulting in under-learning. This was the case even when approaches to reduce the noise, for example
smoothing the χ² over several generations, were used.

The new approach does not attempt to stop the fit, and instead allows it to continue for a preset
large number of generations. The best generation, in terms of over- and under-learning, is then
selected from all seen during the fit as the generation with the lowest validation χ². This technique
therefore prevents overfitting by taking as the final result the set of networks which has the best fit
to a set of unseen data, while avoiding the previously mentioned issues with the standard stopping
approach.

[Figure 1 plot: "Cross-validated vs Fixed length Distances"; panels labelled "Central Value"]

Figure 1: PDF distances between closure test fits with and without cross-validation. The closure test fits
were performed using pseudo-data based on MSTW2008.
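
The look-back selection can be illustrated with a short Python sketch. It assumes a generic minimiser rather than the genetic algorithm actually used in the fits: `fit_with_lookback`, `step` and `validation_chi2` are hypothetical names, and the toy problem is a single parameter whose training optimum (1.2) differs from its validation optimum (1.0).

```python
import copy

def fit_with_lookback(params, step, validation_chi2, n_generations):
    """Run a fixed-length fit and return the generation with the lowest
    validation chi2 (look-back selection instead of early stopping)."""
    best_params = copy.deepcopy(params)
    best_chi2 = validation_chi2(params)
    best_generation = 0
    for generation in range(1, n_generations + 1):
        # One minimisation step driven by the training set only
        params = step(params, generation)
        current = validation_chi2(params)
        if current < best_chi2:
            # Remember the best point seen so far; the fit itself never stops
            best_params = copy.deepcopy(params)
            best_chi2 = current
            best_generation = generation
    return best_params, best_chi2, best_generation

# Toy problem (hypothetical): each step moves the parameter 10% closer to the
# training optimum at 1.2, while the validation chi2 is minimal at 1.0.
def step(p, generation):
    return p + 0.1 * (1.2 - p)

def validation_chi2(p):
    return (p - 1.0) ** 2

best_params, best_chi2, best_generation = fit_with_lookback(
    0.0, step, validation_chi2, n_generations=200
)
```

In this toy the fit keeps running towards the training optimum, but the returned parameters are those of the generation closest to the validation optimum. Because the selection is made after the fact, a noisy upward fluctuation of the validation χ² early in the fit cannot end the minimisation prematurely.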

We can look at the impact of the new cross-validation on the NNPDF fits in closure tests. 
Fig. 1 shows the distances between the PDFs from a closure test fit using look-back cross-validation 
compared to a comparable fixed length fit. The distance is here defined as the absolute difference 
between the closure tests in units of the combined PDF uncertainty on each mean. The closure test 
fits shown here were both performed using pseudo-data generated using the MSTW2008 PDFs. 
The distance between the two fits is generally below five for all PDFs, indicating that there is only 
a slight difference above what we would expect based on statistical fluctuations. These results 
tell us that the introduction of look-back cross-validation has only a small impact on the full
NNPDF3.0 fit. This indicates that whatever overfitting is present in the NNPDF fits is small, 
possibly because of the large size of the dataset used, which has a high level of redundancy. 
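
As a rough illustration of the distance used above, the following Python sketch computes it from two ensembles of PDF replicas. It assumes that the uncertainty on each mean is the replica standard deviation divided by the square root of the number of replicas, and that the two uncertainties are combined in quadrature; the function name `distance` and the array layout are hypothetical.

```python
import numpy as np

def distance(replicas_a, replicas_b):
    """Distance between the central values of two replica ensembles.

    Each input has shape (n_replicas, n_points): one row per replica and
    one column per (flavour, x) point at which the PDFs are sampled.
    """
    a = np.asarray(replicas_a, dtype=float)
    b = np.asarray(replicas_b, dtype=float)
    # Assumed variance of each ensemble mean: replica variance / n_replicas
    var_mean_a = a.var(axis=0, ddof=1) / a.shape[0]
    var_mean_b = b.var(axis=0, ddof=1) / b.shape[0]
    # Absolute difference of the means in units of the combined uncertainty
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(var_mean_a + var_mean_b)

# Toy ensembles (hypothetical): two replicas each, sampled at a single point
fit_a = np.array([[1.0], [3.0]])
fit_b = np.array([[4.0], [6.0]])
d = distance(fit_a, fit_b)
```

Under these assumptions, two fits drawn from the same underlying distribution give distances of order one, so values well above that signal a difference beyond statistical fluctuations.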

However, there are still several reasons to include cross-validation in the final NNPDF3.0 
settings. It is possible that the tests we have used are not precise or comprehensive enough to 
detect all over-learning, and some could remain in the fit. Also, we would like to use the same 
methodology for all fits, including those with reduced datasets where over-learning may be a larger
problem. Fig. 2 demonstrates this by showing the ratio between the χ² of the closure test fits and
that of the MSTW PDFs in fits where only a random subset of the dataset is used. For large datasets
(100% and 50%) there is little difference from including cross-validation, while in small datasets
without cross-validation the χ² is significantly smaller than the MSTW ideal, indicating over-learning.

[Figure 2 plot: "Effect of Cross-validation in Reduced Dataset Fits"]

Figure 2: Ratio of χ²s of closure test fits with and without cross-validation and the MSTW PDFs used to
generate the closure test pseudo-data, for fits with reduced datasets. Each fit has a dataset which has been
randomly reduced to a specified percentage (100%, 50%, 25% and 10%).

[Figure 3 plots: (left) "LHC 13 TeV, NNPDF3.0 closure test versus input"; (right) "LHC 7 TeV, NNPDF Closure test"]

Figure 3: (left) Comparison of closure test based on NNPDF3.0 pseudo-data to NNPDF3.0 for various 13
TeV LHC processes. (right) Same but for W + c at different rapidity values.

5. Closure testing LHC observables 

The closure test results presented in [2] generally looked at the reproduction of the underlying
PDFs at the level of the χ² to the included experimental data and of the PDFs themselves. However,
we can also look at how well LHC observables are reproduced in closure tests. Fig. 3 shows
calculations of a variety of LHC observables using a closure test fit based on NNPDF3.0-derived
pseudo-data, compared to similar values calculated with the NNPDF3.0 PDFs themselves. The left-hand
plot compares inclusive cross-sections for vector-boson production (computed with Vrap [9]), top
pair production (top++ [10]), and Higgs production by gluon-gluon fusion (ggHiggs [11]), while
the right-hand plot shows the differential cross-section for W + c production. In most cases the
closure test result is consistent with the input PDF value at the one-sigma level. The largest
difference is seen for the ggH cross-section, where the closure test is about two standard deviations
from the NNPDF3.0 value. This level of difference is unsurprising given the statistical fluctuations
in the closure test, as we expect, and have explicitly tested for the PDFs, that the one-sigma band
contains the theory value in 68% of cases.

Figure 4: (left) Comparison of the uncertainties in closure test fits performed with different levels of
statistical noise for the data-points of the 7 TeV ATLAS high-mass Drell-Yan dataset. The uncertainties are
shown for each fit as a ratio to the central value of the fit. (right) Same, but for central points in the
ATLAS 7 TeV inclusive jet dataset.
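
The 68% statement can be checked with a simple counting exercise, sketched here in Python. The function name `one_sigma_coverage` is hypothetical, and the toy input is drawn from an ideal Gaussian rather than taken from any real closure test; it only shows what the expected fraction looks like when the quoted uncertainties are faithful.

```python
import numpy as np

def one_sigma_coverage(central, sigma, truth):
    """Fraction of points whose one-sigma band contains the true value."""
    central, sigma, truth = (
        np.asarray(v, dtype=float) for v in (central, sigma, truth)
    )
    inside = np.abs(central - truth) <= sigma
    return float(inside.mean())

# Toy check (hypothetical): central values scattered around the truth with
# exactly the quoted uncertainty, as expected for an unbiased fit.
rng = np.random.default_rng(0)
truth = np.zeros(100_000)
sigma = np.ones(100_000)
central = rng.normal(truth, sigma)
coverage = one_sigma_coverage(central, sigma, truth)
```

A coverage well below 0.68 would point to underestimated uncertainties, and one well above it to overestimated uncertainties. A single observable lying about two standard deviations from the input value is compatible with the expected fraction.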

We can use LHC observables to look at other closure test results. Fig. 4 compares the
uncertainties obtained for ATLAS high-mass Drell-Yan and central inclusive jets in closure test fits
performed with different levels of statistical noise. This allows the different contributions to the
PDF uncertainty to be disentangled; for instance the uncertainty obtained in the fit without any
statistical noise can be identified as the extrapolation uncertainty, due to the limited resolution of
the data. Differences between successive levels of noise then provide an estimate of the functional
uncertainty, from the existence of different equally probable PDFs, and finally of the data uncertainty,
the propagation of the data uncertainties to the PDFs. The results in Fig. 4 are qualitatively similar to
those found in [2] looking at the PDFs themselves, with the data uncertainty generally dominant,
though with sizeable contributions from functional and extrapolation sources. A more precise
description of the different sources of uncertainties is given in the NNPDF3.0 paper [2].
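
The bookkeeping behind this decomposition can be sketched as follows. This is a deliberately naive reading in which "differences between successive levels" is taken to mean plain subtraction of the uncertainties; that choice, the function name `decompose_uncertainty` and the toy numbers are all assumptions, and the precise definitions are those of [2].

```python
def decompose_uncertainty(sigma_no_noise, sigma_partial_noise, sigma_full_noise):
    """Split a PDF uncertainty into three contributions.

    sigma_no_noise      : uncertainty from the fit without statistical noise
    sigma_partial_noise : uncertainty from the intermediate noise level
    sigma_full_noise    : uncertainty from the fit with full statistical noise

    Assumption: successive contributions are obtained by simple subtraction.
    """
    return {
        "extrapolation": sigma_no_noise,
        "functional": sigma_partial_noise - sigma_no_noise,
        "data": sigma_full_noise - sigma_partial_noise,
    }

# Toy relative uncertainties (hypothetical), e.g. in percent of the central value
parts = decompose_uncertainty(1.0, 1.5, 3.0)
```

With these toy inputs the data contribution is the largest of the three, which mirrors the qualitative pattern described for Fig. 4 without reproducing any of its actual values.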